Investigating Verbal Intelligence Using the TF-IDF Approach
نویسندگان
چکیده
In this paper we investigated differences in language use of speakers yielding different verbal intelligence when they describe the same event. The work is based on a corpus containing descriptions of a short film and verbal intelligence scores of the speakers. For analyzing the monologues and the film transcript, the number of reused words, lemmas, n-grams, cosine similarity and other features were calculated and compared to each other for different verbal intelligence groups. The results showed that the similarity of monologues of higher verbal intelligence speakers was greater than of lower and average verbal intelligence participants. A possible explanation of this phenomenon is that candidates yielding higher verbal intelligence have a better short-term memory. In this paper we also checked a hypothesis that differences in vocabulary of speakers yielding different verbal intelligence are sufficient enough for good classification results. For proving this hypothesis, the Nearest Neighbor classifier was trained using TF-IDF vocabulary measures. The maximum achieved accuracy was 92.86%.
منابع مشابه
Text categorization methods for automatic estimation of verbal intelligence
In this paper we investigate whether conventional text categorization methods may suffice to infer different verbal intelligence levels. This research goal relies on the hypothesis that the vocabulary that speakers make use of reflects their verbal intelligence levels. Automatic verbal intelligence estimation of users in a spoken language dialog system may be useful when defining an optimal dia...
متن کاملReal-time, scalable, content-based Twitter users recommendation
Real-time recommendation of Twitter users based on the content of their profiles is a very challenging task. Traditional IR methods such as TF-IDF fail to handle efficiently large datasets. In this paper we present a scalable approach that allows real time recommendation of users based on their tweets. Our model builds a graph of terms, driven by the fact that users sharing similar interests wi...
متن کاملNews Recommendation Using Semantics with the Bing-SF-IDF Approach
Traditionally, content-based news recommendation is performed by means of the cosine similarity and the TF-IDF weighting scheme for terms occurring in news messages and user profiles. Semanticsdriven variants like SF-IDF additionally take into account term meaning by exploiting synsets from semantic lexicons. However, semantics-based weighting techniques are not able to handle – often crucial –...
متن کاملThe Accessibility Dimension for Structured Document Retrieval
Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-de ned document units. This paper reports on an investigation of a tf -idf -acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc, a new parameter called accessibility, that captures the structure of documents. The...
متن کاملUsing Noun Phrases and Tf-idf for Plagiarized Document Retrieval
This paper describes an approach submitted to the 2014 PAN competition for the source retrieval sub-task [7]. Both independent term and phrasal queries are generated, using either term frequency-inverse document frequency or noun phrases to select the terms.
متن کامل